{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "x2Tgtmr1rhZS"
   },
   "source": [
    "## Overview: Waymo Open Dataset -- Perception Object Assets\n",
    "\n",
    "Modeling the 3D world from sensor data for simulation is a scalable way of developing testing and validation environments for robotic learning problems such as autonomous driving. We provide a large-scale object-centric asset dataset containing images and lidar observations of two major object categories (`VEHICLE` and `PEDESTRIAN`) from the released Perception data (v2.0.0). We hope this data will enable and advance research on 3D point cloud reconstruction and completion, object NeRF reconstruction, and generative object assets to address the real-world driving challenges with occlusions, lighting-variations, and long-tail distributions.\n",
    "\n",
    "Please familiarize yourself with the [Perception data v2 format and tutorial](https://github.com/waymo-research/waymo-open-dataset) then proceed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "CMNBf2z0460G"
   },
   "source": [
    "##Get Prepared\n",
    "\n",
    "As a prerequisite, please download the object asset components and the relevant [`LiDARBoxComponent`](src/waymo_open_dataset/v2/perception/box.py) for this tutorial.\n",
    "\n",
    "The object asset components include:\n",
    "* `ObjectAssetLiDARSensorComponent`: LiDAR points within each bounding box;\n",
    "* `ObjectAssetCameraSensorComponent`: Camera image patches, cropping config, and projected LiDAR points;\n",
    "* `ObjectAssetRayComponent`: Camera rays transformed to the bounding box coordinate frame;\n",
    "* `ObjectAssetAutoLabelComponent`: Dense auto-labels on the camera image patches;\n",
    "* `ObjectAssetRefinedPoseComponent`: Optional refined bounding box;\n",
    "* `ObjectAssetRayCompressedComponent`: A compressed copy of the corresponding `ObjectAssetRayComponent`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ABMQzgo6acMv"
   },
   "outputs": [],
   "source": [
    "#@title Access GCP Bucket\n",
    "\n",
    "from google.colab import auth\n",
    "import gcsfs\n",
    "import google.auth\n",
    "auth.authenticate_user()\n",
    "\n",
    "credentials, project_id = google.auth.default()\n",
    "fs = gcsfs.GCSFileSystem(project=project_id, token=credentials)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "IcfbLEezZ-4w"
   },
   "outputs": [],
   "source": [
    "#@title Load data from GCP Bucket\n",
    "\n",
    "from typing import Optional, Any, Mapping, Tuple\n",
    "import warnings\n",
    "# Disable annoying warnings from PyArrow using under the hood.\n",
    "warnings.simplefilter(action='ignore', category=FutureWarning)\n",
    "\n",
    "import dask.dataframe as dd\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import plotly  # Used by visu3d\n",
    "import tensorflow as tf\n",
    "import visu3d\n",
    "\n",
    "from waymo_open_dataset import v2\n",
    "from waymo_open_dataset.utils import camera_segmentation_utils\n",
    "object_asset_utils = v2.object_asset_utils\n",
    "\n",
    "#@markdown Specify data split\n",
    "data_split = 'validation' #@param [\"validation\", \"training\"]\n",
    "# Path to the directory with all components of perception dataset.\n",
    "dataset_dir = dataset_dir = f'waymo_open_dataset_v_2_0_0/{data_split}'\n",
    "# Path to the directory with all components of the perception object asset dataset.\n",
    "asset_dir = dataset_dir\n",
    "\n",
    "context_name = '4195774665746097799_7300_960_7320_960'\n",
    "\n",
    "def read(dataset_dir: str, tag: str) -> dd.DataFrame:\n",
    "  \"\"\"Creates a Dask DataFrame for the component specified by its tag.\"\"\"\n",
    "  paths = fs.glob(f'{dataset_dir}/{tag}/{context_name}.parquet')\n",
    "  return dd.read_parquet(paths, filesystem=fs)\n",
    "\n",
    "\n",
    "def read_codec(dataset_dir: str, tag: str) -> Any:\n",
    "  \"\"\"Reads the codec file for 3D ray decompression.\"\"\"\n",
    "  codec_path = fs.glob(f'{dataset_dir}/{tag}/codec_config.json')[0]\n",
    "  with tf.io.gfile.GFile(codec_path, 'r') as f:\n",
    "    codec_config = v2.ObjectAssetRayCodecConfig.from_json(f.read())\n",
    "  return v2.ObjectAssetRayCodec(codec_config)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "fVGBfyvyraAg"
   },
   "outputs": [],
   "source": [

   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "lhaJn6eq-u6H"
   },
   "outputs": [],
   "source": [
    "#@title List object assets from run\n",
    "\n",
    "#@markdown Select object category\n",
    "asset_type = 'veh' #@param [\"veh\", \"ped\"]\n",
    "\n",
    "asset_df = read(asset_dir, f'{asset_type}_asset_camera_sensor')\n",
    "grouped_asset_df = asset_df.groupby(['key.laser_object_id'])\n",
    "unique_object_df = grouped_asset_df['key.laser_object_id'].unique().compute()\n",
    "\n",
    "num_assets = unique_object_df.shape[0]\n",
    "print(f'Available {num_assets} unique objects:')\n",
    "for i in range(num_assets):\n",
    "  sel_asset_df = grouped_asset_df.get_group(\n",
    "      unique_object_df.iat[i][0])\n",
    "  num_frames = len(sel_asset_df)\n",
    "  _, r = next(sel_asset_df.iterrows())\n",
    "  camera_sensor_component = v2.ObjectAssetCameraSensorComponent.from_dict(r)\n",
    "  print(\n",
    "    'laser_object_id: ', camera_sensor_component.key.laser_object_id,\n",
    "    ' num_of_frames: ', num_frames)\n",
    "  plt.imshow(camera_sensor_component.rgb_image_numpy)\n",
    "  plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ALoVdVGWLCQ1"
   },
   "source": [
    "## Load and visualize camera fields"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "YGjN4T1yLGpT"
   },
   "outputs": [],
   "source": [
    "#@title Setup visualization functions\n",
    "def apply_color_mask(image: np.ndarray,\n",
    "                     mask: np.ndarray,\n",
    "                     color: Tuple[int, int, int],\n",
    "                     alpha: float = 0.5) -> np.ndarray:\n",
    "  \"\"\"Applies the given mask to the image.\"\"\"\n",
    "  color = np.array(color)[np.newaxis, :]\n",
    "  bg = image * (1 - alpha) + alpha * color\n",
    "  output = np.where(mask, bg, image).astype(np.uint8)\n",
    "  return output\n",
    "\n",
    "\n",
    "def grid_imshow(h: int, w: int, images: Any) -> None:\n",
    "  \"\"\"Displays images in a grid.\"\"\"\n",
    "  fig, axes = plt.subplots(h, w)\n",
    "  fig.set_size_inches(20, 10)\n",
    "  fig.tight_layout()\n",
    "  for i, image in enumerate(images):\n",
    "    ax = axes[i] if len(images) > 0 else axes\n",
    "    ax.imshow(image)\n",
    "    ax.axis('off')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "_eywESYvqywY"
   },
   "outputs": [],
   "source": [
    "#@title Select object asset\n",
    "asset_camera_sensor_df = read(asset_dir, f'{asset_type}_asset_camera_sensor')\n",
    "asset_ray_df = read(asset_dir, f'{asset_type}_asset_ray')\n",
    "asset_auto_label_df = read(asset_dir, f'{asset_type}_asset_auto_label')\n",
    "# Load additional LiDAR box dimensions to obtain the ray-box intersection\n",
    "laser_box_df = read(dataset_dir, 'lidar_box')\n",
    "\n",
    "asset_df = v2.merge(asset_camera_sensor_df, asset_ray_df)\n",
    "asset_df = v2.merge(asset_df, asset_auto_label_df)\n",
    "asset_df = v2.merge(asset_df, laser_box_df)\n",
    "\n",
    "grouped_asset_df = asset_df.groupby(['key.laser_object_id'])\n",
    "unique_object_df = grouped_asset_df['key.laser_object_id'].unique().compute()\n",
    "\n",
    "#@markdown Specify the asset index for loading\n",
    "sel_index = 0 #@param {type: \"integer\"}\n",
    "sel_asset_df = grouped_asset_df.get_group(\n",
    "    unique_object_df.iat[sel_index][0]).reset_index()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "wbiIT2nm61g1"
   },
   "outputs": [],
   "source": [
    "#@title Visualize camera patches\n",
    "\n",
    "def _compute_ray_mask(ray_origin: np.ndarray,\n",
    "                      ray_direction: np.ndarray,\n",
    "                      box_size: np.ndarray) -> np.ndarray:\n",
    "  im_height, im_width = ray_origin.shape[:2]\n",
    "  ray_mask, _, _ = object_asset_utils.get_ray_box_intersects(\n",
    "      ray_component.ray_origin.numpy.reshape(-1, 3),\n",
    "      ray_component.ray_direction.numpy.reshape(-1, 3),\n",
    "      lidar_box_component.box.size.numpy,\n",
    "  )\n",
    "  return ray_mask.reshape(im_height, im_width, -1)\n",
    "\n",
    "# Example how to access data fields.\n",
    "print(f'Available {sel_asset_df.shape[0].compute()} rows:')\n",
    "for i, (_, r) in enumerate(sel_asset_df.iterrows()):\n",
    "  # Create component dataclasses for the raw data\n",
    "  lidar_box_component = v2.LiDARBoxComponent.from_dict(r)\n",
    "  camera_sensor_component = v2.ObjectAssetCameraSensorComponent.from_dict(r)\n",
    "  auto_label_component = v2.ObjectAssetAutoLabelComponent.from_dict(r)\n",
    "  ray_component = v2.ObjectAssetRayComponent.from_dict(r)\n",
    "\n",
    "  ray_mask = _compute_ray_mask(\n",
    "      ray_component.ray_origin.numpy,\n",
    "      ray_component.ray_direction.numpy,\n",
    "      lidar_box_component.box.size.numpy)\n",
    "\n",
    "  panoptic_image = camera_segmentation_utils.panoptic_label_to_rgb(\n",
    "      semantic_label=auto_label_component.semantic_mask_numpy,\n",
    "      instance_label=auto_label_component.instance_mask_numpy)\n",
    "\n",
    "  print(\n",
    "      'context_name: ', lidar_box_component.key.segment_context_name,\n",
    "      ' ts: ', lidar_box_component.key.frame_timestamp_micros,\n",
    "      ' laser_object_id: ', lidar_box_component.key.laser_object_id)\n",
    "  grid_imshow(1, 5, [\n",
    "      camera_sensor_component.rgb_image_numpy,\n",
    "      apply_color_mask(\n",
    "          camera_sensor_component.rgb_image_numpy,\n",
    "          camera_sensor_component.proj_points_mask_numpy,\n",
    "          color=(0, 255, 0),\n",
    "          alpha=0.7),\n",
    "      apply_color_mask(\n",
    "          camera_sensor_component.rgb_image_numpy,\n",
    "          ray_mask,\n",
    "          color=(0, 255, 255),\n",
    "          alpha=0.3),\n",
    "      apply_color_mask(\n",
    "          camera_sensor_component.rgb_image_numpy,\n",
    "          auto_label_component.object_mask_numpy ,\n",
    "          color=(255, 255, 0),\n",
    "          alpha=0.3),\n",
    "          panoptic_image])\n",
    "  plt.show()\n",
    "  if i > 2:\n",
    "    print('...')\n",
    "    break\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "JSXzC6101V07"
   },
   "source": [
    "We unproject each camera pixels to 3D and visualize the corresponding rays.\n",
    "Note that these images are captured with five cameras mounted on the car. Please\n",
    "refer to the Waymo Open Dataset -- Perception ([arxiv](https://arxiv.org/abs/1912.04838), [website](https://waymo.com/open/)) for details about camera sensor configurations.\n",
    "- `FRONT`\n",
    "- `FRONT_LEFT`\n",
    "- `FRONT_RIGHT`\n",
    "- `SIDE_LEFT`\n",
    "- `SIDE_RIGHT`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "VtdT7I7siz8O"
   },
   "source": [
    "### Visualization: object-centric coordinate\n",
    "\n",
    "Assuming objects in the **`VEHICLE`** category have rigid shape across all frames, we visualize any single frame and aggregated frames in the object-centric coordinate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "yrscKcuxrE4Y"
   },
   "outputs": [],
   "source": [
    "#@title Unproject pixels in 3D and camera rays (one frame)\n",
    "\n",
    "for i, (_, r) in enumerate(sel_asset_df.iterrows()):\n",
    "  lidar_box_component = v2.LiDARBoxComponent.from_dict(r)\n",
    "  camera_sensor_component = v2.ObjectAssetCameraSensorComponent.from_dict(r)\n",
    "  auto_label_component = v2.ObjectAssetAutoLabelComponent.from_dict(r)\n",
    "  ray_component = v2.ObjectAssetRayComponent.from_dict(r)\n",
    "\n",
    "  ray_dists = camera_sensor_component.proj_points_dist.numpy.reshape(-1, 1)\n",
    "  ray_origins = ray_component.ray_origin.numpy.reshape(-1, 3)\n",
    "  ray_directions = ray_component.ray_direction.numpy.reshape(-1, 3)\n",
    "  # Unproject pixels to 3D.\n",
    "  ray_points = ray_dists * ray_directions + ray_origins\n",
    "  ray_colors = camera_sensor_component.rgb_image_numpy.reshape(-1, 3)\n",
    "\n",
    "  ray_masks = camera_sensor_component.proj_points_mask_numpy.reshape(-1)\n",
    "  ray_masks *= auto_label_component.object_mask_numpy.reshape(-1)\n",
    "\n",
    "  if ray_masks.sum() > 0:\n",
    "    ray_ind = np.where(ray_masks > 0)[0]\n",
    "    v3d_points = visu3d.Point3d(\n",
    "        p=ray_points[ray_ind, :], rgb=ray_colors[ray_ind, :])\n",
    "    v3d_rays = visu3d.Ray(\n",
    "        pos=ray_origins[ray_ind, :], dir=ray_directions[ray_ind, :])\n",
    "    visu3d.make_fig([v3d_points, v3d_rays])\n",
    "  break\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "HDfauCAgitys"
   },
   "outputs": [],
   "source": [
    "#@title Unproject pixels in 3D and camera rays (aggregated frames)\n",
    "\n",
    "v3d_rays = []\n",
    "v3d_points = []\n",
    "\n",
    "for i, (_, r) in enumerate(sel_asset_df.iterrows()):\n",
    "  lidar_box_component = v2.LiDARBoxComponent.from_dict(r)\n",
    "  camera_sensor_component = v2.ObjectAssetCameraSensorComponent.from_dict(r)\n",
    "  auto_label_component = v2.ObjectAssetAutoLabelComponent.from_dict(r)\n",
    "  ray_component = v2.ObjectAssetRayComponent.from_dict(r)\n",
    "\n",
    "  ray_dists = camera_sensor_component.proj_points_dist.numpy.reshape(-1, 1)\n",
    "  ray_origins = ray_component.ray_origin.numpy.reshape(-1, 3)\n",
    "  ray_directions = ray_component.ray_direction.numpy.reshape(-1, 3)\n",
    "  # Unproject pixels to 3D.\n",
    "  ray_points = ray_dists * ray_directions + ray_origins\n",
    "  ray_colors = camera_sensor_component.rgb_image_numpy.reshape(-1, 3)\n",
    "\n",
    "  ray_masks = camera_sensor_component.proj_points_mask_numpy.reshape(-1)\n",
    "  ray_masks *= auto_label_component.object_mask_numpy.reshape(-1)\n",
    "\n",
    "  if ray_masks.sum() > 0:\n",
    "    ray_ind = np.where(ray_masks > 0)[0]\n",
    "    v3d_points.append(\n",
    "        visu3d.Point3d(\n",
    "            p=ray_points[ray_ind, :], rgb=ray_colors[ray_ind, :]))\n",
    "    v3d_rays.append(\n",
    "        visu3d.Ray(\n",
    "            pos=ray_origins[ray_ind, :], dir=ray_directions[ray_ind, :]))\n",
    "\n",
    "visu3d.make_fig(v3d_points + v3d_rays)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "p5J1J8rZSCFF"
   },
   "source": [
    "## Load and visualize point clouds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "rXDDu7uAaM8g"
   },
   "outputs": [],
   "source": [
    "#@title Visualize lidar points\n",
    "asset_df = read(asset_dir, f'{asset_type}_asset_lidar_sensor')\n",
    "\n",
    "grouped_asset_df = asset_df.groupby(['key.laser_object_id'])\n",
    "unique_object_df = grouped_asset_df[\n",
    "    'key.laser_object_id'].unique().compute()\n",
    "\n",
    "sel_asset_df = grouped_asset_df.get_group(\n",
    "    unique_object_df.iat[sel_index][0]).reset_index()\n",
    "\n",
    "all_points_xyz = []\n",
    "for i, (_, r) in enumerate(sel_asset_df.iterrows()):\n",
    "  # Create component dataclasses for the raw data\n",
    "  lidar_sensor_component = v2.ObjectAssetLiDARSensorComponent.from_dict(r)\n",
    "  print(\n",
    "      f'context_name: {lidar_sensor_component.key.segment_context_name}',\n",
    "      f' ts: {lidar_sensor_component.key.frame_timestamp_micros}',\n",
    "      f' laser_object_id: {lidar_sensor_component.key.laser_object_id}')\n",
    "\n",
    "  all_points_xyz.append(lidar_sensor_component.points_xyz.numpy)\n",
    "  if i > 9:\n",
    "    break\n",
    "\n",
    "v3d_point_cloud = visu3d.Point3d(\n",
    "    p=np.concatenate(all_points_xyz, axis=0),\n",
    "    rgb=(255, 0, 0),\n",
    ")\n",
    "v3d_point_cloud.fig"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Gm3YNDY23IqF"
   },
   "source": [
    "Human labled 3D laser bounding boxes in the Waymo Open Dataset -- Perception\n",
    "are designed for the object detection task. Given a sequence of 3D laser bounding boxes, we further refine the boxes through 3D point cloud registration and provide the refined box pose (as 4x4 rigid transformation) per observation in the **`VEHICLE`** category.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "VveR4aDe43le"
   },
   "outputs": [],
   "source": [
    "#@title Visualize and compare lidar points (with refined box pose)\n",
    "\n",
    "asset_lidar_sensor_df = read(asset_dir, 'veh_asset_lidar_sensor')\n",
    "# Note: Registered Box Pose is only defined for vehicle objects.\n",
    "asset_refined_pose_df = read(asset_dir, 'veh_asset_refined_pose')\n",
    "\n",
    "# Load additional LiDAR box dimensions to obtain the ray-box intersection\n",
    "laser_box_df = read(dataset_dir, 'lidar_box')\n",
    "\n",
    "asset_df = v2.merge(asset_lidar_sensor_df, asset_refined_pose_df)\n",
    "asset_df = v2.merge(asset_df, laser_box_df)\n",
    "\n",
    "grouped_asset_df = asset_df.groupby(['key.laser_object_id'])\n",
    "unique_object_df = grouped_asset_df[\n",
    "    'key.laser_object_id'].unique().compute()\n",
    "\n",
    "sel_asset_df = grouped_asset_df.get_group(\n",
    "    unique_object_df.iat[sel_index][0]).reset_index()\n",
    "\n",
    "all_points_xyz = []\n",
    "for i, (_, r) in enumerate(sel_asset_df.iterrows()):\n",
    "  # Create component dataclasses for the raw data\n",
    "  lidar_sensor_component = v2.ObjectAssetLiDARSensorComponent.from_dict(r)\n",
    "  refined_pose_component = v2.ObjectAssetRefinedPoseComponent.from_dict(r)\n",
    "  lidar_box_component = v2.LiDARBoxComponent.from_dict(r)\n",
    "  print(\n",
    "      f'context_name: {lidar_sensor_component.key.segment_context_name}',\n",
    "      f' ts: {lidar_sensor_component.key.frame_timestamp_micros}',\n",
    "      f' laser_object_id: {lidar_sensor_component.key.laser_object_id}')\n",
    "  points_xyz = lidar_sensor_component.points_xyz.numpy\n",
    "  box_3d = lidar_box_component.box.numpy(dtype=np.float64)\n",
    "  refined_box_from_vehicle = np.asarray(\n",
    "      refined_pose_component.box_from_vehicle.transform,\n",
    "      dtype=np.float64).reshape(4, 4)\n",
    "  points_xyz_tfm = object_asset_utils.transform_points_to_box_coord_reverse(\n",
    "      points_xyz, box_3d)\n",
    "  points_xyz_refined = object_asset_utils.transform_points_to_frame(\n",
    "      points_xyz_tfm, refined_box_from_vehicle\n",
    "  )\n",
    "  all_points_xyz.append(points_xyz_refined)\n",
    "  if i > 9:\n",
    "    break\n",
    "\n",
    "v3d_point_cloud_refined = visu3d.Point3d(\n",
    "    p=np.concatenate(all_points_xyz, axis=0),\n",
    "    rgb=(0, 0, 255),\n",
    ")\n",
    "visu3d.make_fig([v3d_point_cloud_refined, v3d_point_cloud])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vst7-W-uiHQK"
   },
   "source": [
    "## Access compressed 3D rays\n",
    "\n",
    "Besides the accurate 3D rays for every camera pixel, we provide a compressed version for the corresponding 3D rays. This results in a significant reduction\n",
    "in dataset size, while introducing a neglible error (~1e-5 m) in ray origins and\n",
    "directions.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ZrXPBGENlD9U"
   },
   "outputs": [],
   "source": [
    "#@title Setup ray compressed component\n",
    "\n",
    "# The codec that is used to decompress the ray compressed component.\n",
    "codec = read_codec(asset_dir, f'{asset_type}_asset_ray_compressed')\n",
    "\n",
    "asset_ray_compressed_df = read(asset_dir, f'{asset_type}_asset_ray_compressed')\n",
    "asset_camera_sensor_df = read(asset_dir, f'{asset_type}_asset_camera_sensor')\n",
    "\n",
    "# Load additional LiDAR box dimensions to obtain the ray-box intersection\n",
    "laser_box_df = read(dataset_dir, 'lidar_box')\n",
    "\n",
    "asset_df = v2.merge(asset_camera_sensor_df, asset_ray_compressed_df)\n",
    "asset_df = v2.merge(asset_df, laser_box_df)\n",
    "\n",
    "grouped_asset_df = asset_df.groupby(['key.laser_object_id'])\n",
    "unique_object_df = grouped_asset_df['key.laser_object_id'].unique().compute()\n",
    "\n",
    "sel_asset_df = grouped_asset_df.get_group(\n",
    "    unique_object_df.iat[sel_index][0]).reset_index()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "TzU1r7fn1ECS"
   },
   "outputs": [],
   "source": [
    "#@title Unproject pixels in 3D and visualize compressed camera rays (one frame)\n",
    "\n",
    "it = sel_asset_df.iterrows()\n",
    "\n",
    "_, r = next(it)\n",
    "ray_compressed_component = v2.ObjectAssetRayCompressedComponent.from_dict(r)\n",
    "ray_component = codec.decode(ray_compressed_component)\n",
    "camera_sensor_component = v2.ObjectAssetCameraSensorComponent.from_dict(r)\n",
    "\n",
    "ray_dists = camera_sensor_component.proj_points_dist.numpy.reshape(-1, 1)\n",
    "ray_origins = ray_component.ray_origin.numpy.reshape(-1, 3)\n",
    "ray_directions = ray_component.ray_direction.numpy.reshape(-1, 3)\n",
    "\n",
    "ray_points = ray_dists * ray_directions + ray_origins\n",
    "ray_colors = camera_sensor_component.rgb_image_numpy.reshape(-1, 3)\n",
    "\n",
    "ray_masks = camera_sensor_component.proj_points_mask_numpy.reshape(-1)\n",
    "\n",
    "ray_ind = np.where(ray_masks > 0)[0]\n",
    "v3d_points = visu3d.Point3d(\n",
    "    p=ray_points[ray_ind, :], rgb=ray_colors[ray_ind, :])\n",
    "v3d_rays = visu3d.Ray(\n",
    "    pos=ray_origins[ray_ind, :], dir=ray_directions[ray_ind, :])\n",
    "visu3d.make_fig([v3d_points, v3d_rays])"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "private_outputs": true,
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
